Abstract. Verification of software systems is a very hard problem due
to the large size of the program state-space. Traditional techniques (like
model checking) do not scale, since they include the whole state-space by
inlining the code of the library functions. Current research avoids this
problem by creating a lightweight representation of the library in the form
of an interface graph (call sequence graph). In this paper we introduce a
new algorithm to compute a safe, permissive interface graph for C-type
functions. In this modular analysis, each function transition is summarized
following three-valued abstraction semantics. Two kinds of abstraction are
used. The global abstraction contains predicates over global variables only,
whereas the local abstraction inside each function may also contain local
variables. The abstract summary needs refinement to guarantee safety and
permissiveness. We have implemented the algorithms in the TICC tool and
compared them with some related interface generation algorithms. We also
discuss the application of an interface as an offline test-suite: we create
an interface from the model program (specification), and the interface acts
as a test-suite for the new implementation-under-test (IUT).



1 Introduction 

Verification of software systems is a very hard problem due to the large size
of the program state-space. Most software programs contain library functions,
and such functions are examples of open systems. The verification of open
systems becomes infeasible due to two main problems. Firstly, in order to
verify a given program one needs to inline the library function code, which
increases the space complexity of the verification algorithms. Current formal
techniques like model checking cannot handle the large state-space generated
from the program variables. The second option is to verify the library
functions a priori so that there is no need to inline them. For this purpose,
a small piece of code containing a sequence of library function calls (called
a client) is usually written. The client code invokes the library functions
to close the open system. However, the library functions are impossible to
verify in the absence of an exhaustive client program. Hence most verification
approaches plug in a client code to close the open system.



1.1 Interface and Properties 



The current research [91113] avoids these two problems by applying modular
verification techniques which build a small call sequence graph, called an
interface, representing the union of all client programs. The interface
contains all possible call sequences which lead the library to error or
illegal states. Similarly, the interface should contain all possible call
sequences which avoid the error states. The interface therefore constrains
the use of the library function calls from outside, and the user can
distinguish the legal call sequences from the illegal ones by simply looking
at the interface. There are two immediate benefits of using interfaces.
Firstly, interfaces are light-weight representations of the libraries, and
the implementation of the library functions can be replaced by the interface.
Secondly, interfaces can be constructed without the help of any client
program. The interface should be safe, i.e. all illegal call sequences (which
lead the library to the error states) are present in the interface. The
interface graph should be permissive, i.e. all legal sequences are present in
the interface.

1.2 Related Work 

However, there are some challenges in building succinct interfaces. The
interface size can become exponential in the number of variables. Symbolic
representation and abstraction techniques partition the state-space into a
small number of regions, where every region represents one node of the
interface graph. Several works apply these abstraction and symbolic techniques
to obtain a small but safe and permissive interface.

The work by Alur et al. (pQ) uses Angluin's learning algorithm L* to create
an interface. The algorithm learns the interface language by asking membership
and equivalence queries to a teacher (here the program). The generated
interface is safe and minimal, but not permissive. To handle big case studies
predicate abstraction has been used; however, the user needs to provide the
predicates. There is no automatic abstraction refinement. The algorithm
returns a minimal-size interface if it is not hit by a timeout. Experimental
results show that timeouts occur even in small examples. The CEGAR approach
by Henzinger et al. ([9]) creates a safe and permissive interface. The size of
the interface can be large, depending on the chosen counter-example. The
direct approach by Beyer et al. ([3]) creates an interface which is safe and
permissive. This approach does not use abstraction and hence the interface
can become very large.

1.3 Contribution 

Unlike the related work, our work can also be applied to unstructured or
non-object-oriented (C style) functions. In an object-oriented framework
every class variable is accessible to every class method and can be treated
as a global variable of the class method. Instead, we assume that each
function may contain several local variables in addition to those global
variables. Hence, we have a more general platform to compute interfaces. Each
of these functions can also have several sequential updates of variables,
calls to other functions, and even recursive calls to itself. However, we
compute the interface including only the functions accessible at the user
level.

In the first stage of the three-stage algorithm, every C library function is
parsed by CIL (C Intermediate Language) [TT] and converted into the TICC [I]
input language. The syntax of this language is similar to a guarded-update
language. We have implemented the next two stages in TICC, a symbolic tool
based on Multi-valued Decision Diagrams [10]. The second stage computes the
transition summary of each function. This modular algorithm handles each
function separately, including the local variables within its scope. However,
the space complexity of the function summary becomes a bottleneck for big
functions which may contain a large number of guarded updates. Hence, we
employ three-valued abstraction refinement schemes in addition to symbolic
techniques. The abstraction in summarization ensures small size, whereas
successive refinement of the abstract states fine-tunes the abstraction to
obtain safety and permissiveness. In the last stage, an interface graph is
built from the abstract set of states. We show the different stages of
building a symbolic safe and permissive interface in the following example.



Example 1 (Motivating Example). Figure 1(a) defines a stack data-type stackT
and two functions push and pop. The data type stackT has an array of integers
el of size MAX and an integer top showing the top of the stack. The function
pop returns an error when the stack is empty, i.e. top is zero. The function
push returns an error if top is equal to MAX. Otherwise it copies the input
value sd into the el array at address top and then increments top. Figure 1(b)
shows how the C code is converted into guarded-update rules in the next stage.
The global variable err denotes the error in the library, and the library
goes to an error state when err is set to 1. Figure 1(c) shows the interface
graph obtained from the set of rules. The initial state of the interface graph
is state 1, where the stack is empty. A call to the pop function from the
initial state moves the library into an ERROR state. Similarly, calling push
from state 3 is an error due to a full stack. Note that the interface can
generate many legal as well as illegal sequences of stack functions. To check
each of them we would otherwise need a set of client programs.

Finally we discuss the applications of the safe and permissive interface
graph. Firstly, any given client program can be checked immediately against
the interface graph to see whether the function call sequence in the client
leads the library to an error state. Secondly, the interface can provide an
offline test-suite for a set of functions. Often the source of the library is
unknown; however, one can create a model program from the available
documentation of the functions. The interface graph obtained from the model
program can be used to test the implementation-under-test (IUT).
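
As a minimal sketch of the first application, the interface of the stack example can be stored as a lookup table and a client's call sequence replayed on it. The node numbers follow the description of Figure 1(c) in Example 1; the dictionary layout and the helper name check_calls are assumptions made for this illustration and are not part of TICC.

```python
# Interface graph of the stack example, transcribed from the prose description
# of Figure 1(c): node 1 is the empty stack, node 3 is the full stack.
# The encoding as a dictionary is a hypothetical choice for this sketch.
ERROR = "ERROR"
INTERFACE = {
    (1, "push"): 2, (2, "push"): 3, (3, "push"): ERROR,
    (1, "pop"): ERROR, (2, "pop"): 1, (3, "pop"): 2,
}


def check_calls(calls, start=1):
    """Return the index of the first call that reaches ERROR, or None if legal."""
    node = start
    for index, function_name in enumerate(calls):
        node = INTERFACE[(node, function_name)]
        if node == ERROR:
            return index
    return None


if __name__ == "__main__":
    print(check_calls(["push", "pop", "push", "push"]))  # legal sequence
    print(check_calls(["push", "push", "push"]))         # third push is illegal
```

The same replay loop could serve as an offline test oracle: each path of the graph is a call sequence whose expected outcome (error or no error) is known in advance.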



2 Preliminary Definitions 

In this section we provide preliminary definitions and the background work. 
(a) Code

    #include <stdio.h>
    #include <stdlib.h>

    #define MAX 3
    typedef struct {
        int el[MAX];   // array based
        int top;       // range: 0 to MAX
    } stackT;

    void Pop(stackT *st) {
        if (st->top == 0) {
            fprintf(stderr, "stack empty");
            exit(1);
        }
        st->top = st->top - 1;
    }

    void Push(stackT *st, int sd) {
        if (st->top == MAX) {
            fprintf(stderr, "stack full");
            exit(1);
        }
        st->el[st->top] = sd;
        st->top = st->top + 1;
    }

(b) Rules

    var sd, top : [0..3]
    var el_0, el_1, el_2 : [0..3]
    var err : [0..1]

    module pop:
        var s : [0..1]
        initial : s = 0
        output pop1 : {
            s = 0 & top > 0 ==> s' = 1 & top' = top - 1;
            s = 0 & top = 0 ==> s' = 1 & err' = 1;
        }
    endmodule

    module push:
        var s : [0..1]
        initial : s = 0
        output push1 : {
            s = 0 & top = 0  ==> s' = 1 & el_0' = sd & top' = top + 1;
            s = 0 & top = 1  ==> s' = 1 & el_1' = sd & top' = top + 1;
            s = 0 & top >= 2 ==> s' = 1 & err' = 1;
        }
    endmodule

(c) Interface graph (drawing not reproduced; it contains the ERROR node)

Fig. 1. Stack Example



2.1 A Transition System Model for Libraries 

A software library module $Lib = (F_G, V_G, E, I)$ contains a set of functions
$F_G$ and a set of global variables $V_G$. The global variables $V_G$ are the
variables declared outside all of the functions in $F_G$. The global state
space $S_G$ is defined with respect to the different valuations of the global
variables $V_G$. The variable $err \in V_G$ is a special global variable in
Lib which can take the two values 0 and 1. The library reaches the error set
$E \subseteq S_G$ when the global variable err is set to 1. Moreover, the
error set is a sink set of the library. The initial configuration of the
library is given by the set $I \subseteq S_G$.

Each function $f \in F_G$ also contains a set of local variables $V_L^f$. The
scope of any local variable $v \in V_L^f$ is the function $f$. There is a
special local variable, called s, in $V_L^f$ which corresponds to the relative
location in the function with respect to the first location. For a function
$f$, the set of all variables $V^f$ is given as $V_L^f \cup V_G$, and the
function state-space $S_f$ is defined with respect to the different valuations
of $V^f$. We note that each global state $s_G \in S_G$ is a non-empty subset
$s_G \subseteq S_f$ of the function state-space. The initial local state set
$I_L^f \subseteq S_f$ denotes the entry point to the function $f$. The set of
all variables of the library Lib is denoted by $V$ and is given by
$V := V_G \cup \bigcup_{f \in F_G} V^f$. The total state-space $S$ is defined
with respect to the different valuations of all variables $V$.

Each function $f \in F_G$ contains some number (say $k$) of guarded-update
rules. For the $i$-th such rule, its condition part $i.guard \subseteq S_f$ is
given as a set of function states, and its assignment part
$i.update \subseteq S_f \times S_f$ is given as a set of transitions. For a
set $X \subseteq S_f$, $i.update(X) \subseteq S_f$ denotes the next states of
$X$ under the $i$-th update rule. The conditional transition of rule $i$ is
given as

$$i.trans := \{(s_1, s_2) \in S_f \times S_f \mid s_1 \in i.guard,\ s_2 \in i.update(i.guard)\}.$$

The transition relation $Trans^f \subseteq S_f \times S_f$ is given as the
union of the rules of the function $f$, i.e.
$Trans^f := \bigcup_{i=1 \ldots k} i.trans$. We use $Trans^f(t) \subseteq S_f$
to denote the successor set of a state $t \in S_f$.
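
The construction of a transition relation from guarded-update rules can be sketched in Python on the pop function of the stack example. This is only an illustration under assumptions: states are encoded as tuples (s, top, err), each rule is a hypothetical pair of Python functions, and the update is applied state by state (each guard state is paired with its own image).

```python
from itertools import product

# Function state-space of pop: all valuations of (s, top, err), with the
# ranges taken from the variable declarations of the stack example.
STATES = set(product(range(2), range(4), range(2)))

# Each rule is a (guard, update) pair; this encoding is an assumption.
POP_RULES = [
    (lambda s, top, err: s == 0 and top > 0,
     lambda s, top, err: (1, top - 1, err)),
    (lambda s, top, err: s == 0 and top == 0,
     lambda s, top, err: (1, top, 1)),
]


def rule_trans(guard, update):
    """Transitions of one rule: every guard state paired with its image."""
    return {(state, update(*state)) for state in STATES if guard(*state)}


def trans(rules):
    """Transition relation of a function: the union of its rule transitions."""
    relation = set()
    for guard, update in rules:
        relation |= rule_trans(guard, update)
    return relation


def successors(relation, state):
    """Successor set of a single state."""
    return {target for (source, target) in relation if source == state}


if __name__ == "__main__":
    pop_trans = trans(POP_RULES)
    print(successors(pop_trans, (0, 2, 0)))
    print(successors(pop_trans, (0, 0, 0)))
```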

For a binary relation $\bowtie\ \in \{=, <, >\}$ and a state-space $S$, the
set $S|_{v \bowtie a}$ denotes the set of states where the value of a variable
$v$ is related to the value $a$ by the relation $\bowtie$. For a set
$X \subseteq S_f$, we define $support(X) \subseteq V^f$ as the set of
variables whose change of value can change membership in $X$. Formally,

$$support(X) := V^f \setminus \{v \in V^f \mid \forall s, s' \in S_f .\ s =_v s' \Rightarrow (s \in X \Leftrightarrow s' \in X)\}$$

where $s =_v s'$ means that $s = s'$ except possibly for the variable $v$.

An interface graph is an input-enabled interface automaton. Given a library
$Lib = (F_G, V_G, E, I)$ and a global state-space $S_G$, we define the
interface graph or call sequence graph as $IG = (N, T, T_e, In, Er)$ where

- the nodes $N \subseteq 2^{S_G}$ correspond to sets of states,

- the set $In \subseteq N$ denotes the initial nodes corresponding to $I$,

- the set $Er \subseteq N$ denotes the error nodes corresponding to $E$,

- the set $T \subseteq N \times F_G \times (N \setminus Er)$ denotes the good
  transitions,

- the set $T_e \subseteq N \times F_G \times Er$ denotes the erroneous
  transitions.

2.2 Three Valued Abstraction 

For a library $L = (F_G, V_G)$, a function $f \in F_G$ and a function
state-space $S_f$, an abstraction $R \subseteq 2^{S_f}$ is defined such that
each abstract state (or region) $r \in R$ is a non-empty subset
$r \subseteq S_f$ of concrete states. We require $\bigcup R = S_f$. For
subsets $T \subseteq S_f$ and $U \subseteq R$, we write:

$$U{\downarrow} = \bigcup_{u \in U} u \qquad T{\uparrow}_R^m = \{r \in R \mid r \cap T \neq \emptyset\} \qquad T{\uparrow}_R^M = \{r \in R \mid r \subseteq T\}$$

Thus, for a set $U \subseteq R$ of abstract states, $U{\downarrow}$ is the
corresponding set of concrete states. For a set $T \subseteq S_f$ of concrete
states, $T{\uparrow}_R^m$ and $T{\uparrow}_R^M$ are the sets of abstract states
that constitute over- and under-approximations of the concrete set $T$. We say
that the abstraction $R$ of a state-space $S_f$ is precise for a set
$T \subseteq S_f$ of states if $T{\uparrow}_R^m = T{\uparrow}_R^M$.
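
The concretization and the two abstraction operators translate directly into set operations. The following Python sketch is illustrative only: regions are modelled as frozensets of integers, and the function names concretize, up_may, up_must and is_precise are names chosen here, not taken from the tool.

```python
def concretize(regions):
    """Concrete states covered by a set of regions (the down-arrow operator)."""
    states = set()
    for region in regions:
        states |= region
    return states


def up_may(concrete, abstraction):
    """Over-approximation: regions that intersect the concrete set."""
    return {r for r in abstraction if r & concrete}


def up_must(concrete, abstraction):
    """Under-approximation: regions entirely contained in the concrete set."""
    return {r for r in abstraction if r <= concrete}


def is_precise(concrete, abstraction):
    """An abstraction is precise for a set when may and must coincide."""
    return up_may(concrete, abstraction) == up_must(concrete, abstraction)


if __name__ == "__main__":
    R = {frozenset({0, 1}), frozenset({2, 3}), frozenset({4, 5})}
    T = {0, 1, 2}
    print(up_may(T, R))      # two regions intersect T
    print(up_must(T, R))     # only {0, 1} lies inside T
    print(is_precise(T, R))  # region {2, 3} is cut by T
```

Splitting the region {2, 3} into {2} and {3} would make the abstraction precise for T, which is the effect that refinement aims for.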

2.3 μ-Calculus

We will express our algorithms for solving reachability on the function state
space in μ-calculus notation [S]. Consider a procedure
$\gamma : 2^{V^f} \mapsto 2^{V^f}$, monotone when $2^{V^f}$ is considered as a
lattice with the usual subset ordering. We denote by $\mu Z.\gamma(Z)$ (resp.
$\nu Z.\gamma(Z)$) the least (resp. greatest) fix-point of $\gamma$, that is,
the least (resp. greatest) set $Z \subseteq V^f$ such that $Z = \gamma(Z)$. As
is well known, since $V^f$ is finite, these fix-points can be computed via
Picard iteration: $\mu Z.\gamma(Z) = \lim_{n \to \infty} \gamma^n(\emptyset)$
and $\nu Z.\gamma(Z) = \lim_{n \to \infty} \gamma^n(V^f)$.
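
Picard iteration on a finite lattice of sets can be sketched in a few lines of Python. The helper names lfp and gfp and the small example graph are assumptions made for illustration; the only requirement is that the supplied function is monotone.

```python
def lfp(gamma):
    """Least fix-point of a monotone set function, iterating from the empty set."""
    current = frozenset()
    while True:
        following = frozenset(gamma(current))
        if following == current:
            return current
        current = following


def gfp(gamma, universe):
    """Greatest fix-point of a monotone set function, iterating from the universe."""
    current = frozenset(universe)
    while True:
        following = frozenset(gamma(current))
        if following == current:
            return current
        current = following


if __name__ == "__main__":
    # A hypothetical four-node graph: 0 -> 1 -> 2 and a self-loop on 3.
    edges = {0: {1}, 1: {2}, 2: set(), 3: {3}}
    # States reachable from 0 (least fix-point).
    print(lfp(lambda Z: {0} | {t for s in Z for t in edges[s]}))
    # States with an infinite path (greatest fix-point).
    print(gfp(lambda Z: {s for s in Z if edges[s] & Z}, edges.keys()))
```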
2.4 Predecessor Operators

For a library function $f$ and a function state-space $S_f$, we define the
one-step predecessor operator $Pre^{f,1} : 2^{S_f} \mapsto 2^{S_f}$ as follows,
for all $Y \subseteq S_f$:

$$Pre^{f,1}(Y) = \{s \in S_f \mid Trans^f(s) \cap Y \neq \emptyset\} \qquad (1)$$

We define the multi-step predecessor operator
$Pre^{f,*} : 2^{S_f} \mapsto 2^{S_f}$ as follows, for all $Y \subseteq S_f$:

$$Pre^{f,*}(Y) = \{s \in S_f \mid \{s\} \cap (\mu X.(Y \cup Pre^{f,1}(X))) \neq \emptyset\} \qquad (2)$$

Intuitively, the set $Pre^{f,*}(Y)$ consists of the states of $S_f$ from which
one can reach $Y$ by applying zero or more transitions within the function
$f$, that is, by applying its rules one after another.

For the abstract state space $R$, we introduce abstract versions of
$Pre^{f,*}$. As multiple concrete states may correspond to the same abstract
state, we cannot compute, on the abstract state space, a precise analogue of
$Pre^{f,*}$. We define two abstract operators: the may operator
$Pre_m^{f,R} : 2^R \mapsto 2^R$, which constitutes an over-approximation of
$Pre^{f,*}$, and the must operator $Pre_M^{f,R} : 2^R \mapsto 2^R$, which
constitutes an under-approximation of $Pre^{f,*}$ [6]. We let, for
$U \subseteq R$:

$$Pre_m^{f,R}(U) = Pre^{f,*}(U{\downarrow}){\uparrow}_R^m \qquad Pre_M^{f,R}(U) = Pre^{f,*}(U{\downarrow}){\uparrow}_R^M \qquad (3)$$

The fact that $Pre_m^{f,R}$ and $Pre_M^{f,R}$ are over- and
under-approximations of the predecessor operator is made precise by the
following observation: for all $U \subseteq R$ we have

$$Pre_M^{f,R}(U){\downarrow} \subseteq Pre^{f,*}(U{\downarrow}) \subseteq Pre_m^{f,R}(U){\downarrow} \qquad (4)$$
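
The concrete and abstract predecessor operators, together with the sandwich property (4), can be checked on a toy example. The sketch below is a Python illustration under assumptions: the six-state transition relation and the four regions are invented for the example, and states are plain integers.

```python
# A hypothetical transition relation on states 0..5.
TRANS = {(0, 1), (1, 5), (2, 3), (3, 3), (4, 5)}
# A hypothetical abstraction: four regions covering all states.
R = {frozenset({0, 2}), frozenset({1}), frozenset({3}), frozenset({4, 5})}


def pre1(targets):
    """One-step predecessors: states with some successor in targets."""
    return {s for (s, t) in TRANS if t in targets}


def pre_star(targets):
    """Multi-step predecessors as the least fix-point of X = Y u pre1(X)."""
    current = set()
    while True:
        following = set(targets) | pre1(current)
        if following == current:
            return current
        current = following


def concretize(regions):
    states = set()
    for region in regions:
        states |= region
    return states


def pre_may(regions):
    """Regions that intersect the concrete multi-step predecessors."""
    concrete = pre_star(concretize(regions))
    return {r for r in R if r & concrete}


def pre_must(regions):
    """Regions fully contained in the concrete multi-step predecessors."""
    concrete = pre_star(concretize(regions))
    return {r for r in R if r <= concrete}


if __name__ == "__main__":
    U = {frozenset({4, 5})}
    print(pre_star(concretize(U)))
    print(pre_must(U), pre_may(U))
```

Here the region {0, 2} is a may-predecessor but not a must-predecessor, because only state 0 can reach the target.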

For an integer $k \geq 1$ and a function state-space $S_f$, we recursively
define the $k$-step post operator $Post^{f,k} : 2^{S_f} \mapsto 2^{S_f}$ as
follows, for all $X \subseteq S_f$:

$$Post^{f,1}(X) = \bigcup_{x \in X} Trans^f(x) \qquad (5)$$
$$Post^{f,k}(X) = Trans^f(Post^{f,k-1}(X)) \qquad (6)$$

For an abstract state space $R \subseteq 2^{S_f}$, we define the abstract post
operator $Post^{f,R} : 2^R \mapsto 2^R$ as follows, for all $X \subseteq R$:

$$Post^{f,R}(X) = \{r \in R \mid r \cap Post^{f,k}(I_L^f \cap (X{\downarrow})) \neq \emptyset\} \qquad (7)$$

where $k$ is the smallest integer satisfying
$Post^{f,k+1}(I_L^f \cap (X{\downarrow})) = \emptyset$. Intuitively, the
condition implies that no new states are added in the $(k+1)$-th iteration;
hence the last updated value when $f$ returns can be obtained by applying
$Post^{f,k}$ to the subset of $X{\downarrow}$ corresponding to the function's
initial state set $I_L^f$.
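
The abstract post operator can be illustrated on the pop function of the stack example. The Python sketch below is an assumption-laden illustration: states are tuples (s, top, err), the three regions are the initial abstraction used later in Example 2, and the iteration bound is a safeguard added here for functions that never return.

```python
from itertools import product

STATES = set(product(range(2), range(4), range(2)))       # (s, top, err)
INIT_LOCAL = {state for state in STATES if state[0] == 0}  # entry point s = 0

# Initial abstraction: error states, empty stack, non-empty stack.
R0 = frozenset(st for st in STATES if st[2] == 1)
R1 = frozenset(st for st in STATES if st[2] == 0 and st[1] == 0)
R2 = frozenset(st for st in STATES if st[2] == 0 and st[1] > 0)
R = {R0, R1, R2}


def step(state):
    """Successors of one state under the two pop rules."""
    s, top, err = state
    if s != 0:
        return set()
    return {(1, top - 1, err)} if top > 0 else {(1, top, 1)}


def post(states):
    """One-step post image of a set of states."""
    image = set()
    for state in states:
        image |= step(state)
    return image


def abstract_post(regions, bound=100):
    """Regions hit by the last non-empty post image from the entry states."""
    current = post(INIT_LOCAL & set().union(*regions))
    for _ in range(bound):
        following = post(current)
        if not following:
            return {r for r in R if r & current}
        current = following
    raise RuntimeError("function did not return within the bound")


if __name__ == "__main__":
    print(abstract_post({R1}) == {R0})      # pop on an empty stack is an error
    print(abstract_post({R2}) == {R1, R2})  # pop on a non-empty stack
```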
3 Translation from C to Guard-Update Rules



In this section we discuss our procedure to convert C functions into the
"sociable interface automata" [5] format. This format contains several
guarded-update rules and is the input format of our symbolic tool TICC. In our
work the front-end and back-end are separate. Hence one only needs a different
front-end to parse functions from another language (like Java or C++) and to
generate TICC input models. The later stages of the algorithm can reuse our
tool TICC to build interface graphs.

The C functions are fed into the CIL [TT] tool, which parses C source code and
returns the control flow graph. The control flow graph contains blocks as
nodes and conditions as transitions. We convert the control flow graph of each
function into a set of guarded-update rules. The conditions are represented as
guards and the assignments are represented as updates. The special local
variable s defines the location of the current block. For a variable v, the
primed variable v' denotes the value of v in the next sequential step. When
the translator encounters a critical error condition (e.g. a call to exit(1))
in the control flow graph, the global variable err is set to 1 in the
translated library.



- Control Flow Structures: C source like "if (a == 0) {b = 0;} else {b = 1;}"
  is converted into the following rules:

      a = 0,  s = 0 ==> b' = 0, s' = 1;
      a != 0, s = 0 ==> b' = 1, s' = 1

  The switch and loop (like while, for) structures can be handled similarly.

- Variables and Data Structures: Currently the algorithm supports unsigned
  integers with a small number (e.g. 4) of bits. Fixed-size arrays and
  structures are flattened in the translation process. The Integer Stack
  example in Figure 1(b) shows how an array of size 3 is translated into 3
  integer variables. The structure elements are also flattened in the example.
  Currently our translation does not directly handle pointers and recursive
  data types. However, we can manually translate pointers into integers if we
  know that the control flow of the function does not depend on the value at
  the pointer location.

- Function Calls: Currently, in order to compute the abstract transition for
  a function f, we inline all the intermediate function calls inside the body
  of f. In the guarded-update rule semantics, the rules of the intermediate
  functions are explicitly added to the rules of f. An explicit stack data
  structure is added to store the return address and the context variables.
  This trick can be applied to one function calling another function as well
  as to non-tail-recursive function calls. Tail-recursive function calls can
  be converted into loops and do not need the stack. In the Appendix, we show
  a complete translation of a recursive C function.
4 Algorithm



In this section we assume that the C functions are already parsed by CIL and
converted into a software library module $Lib = (F_G, V_G, E, I)$. We describe
the basic algorithms for abstraction refinement and for building an interface
from a given library Lib. We also provide some implementation-specific
optimizations.

4.1 Basic Algorithm

Algorithm 1 computes the interface for the library $Lib = (F_G, V_G, E, I)$.
The algorithm takes as input the library Lib, a set of functions
$F \subseteq F_G$, and an abstraction $R$. The first abstraction is obtained
from the error set $E$ and the initial set $I$. Let us define
$r_1 = \{s \in S_G \mid s \in E\}$,
$r_2 = \{s \in S_G \mid s \notin E, s \in I\}$ and
$r_3 = \{s \in S_G \mid s \notin E, s \notin I\}$. For $i \in \{1, 2, 3\}$, if
$r_i$ is non-empty, then we add the set to $R$ as one of the initial abstract
states. Algorithm 1 calls AbsRef for every function $f \in F$ separately to
obtain a refined abstraction $R$ with respect to the function. The procedure
BuildInterface returns an interface graph IG given the set of abstract states.



Algorithm 1 Explore(Lib, F, R)

Input: a library $Lib = (F_G, V_G, E, I)$, set of functions F, abstraction R
Output: Interface Graph IG

1. for each $f \in F$ do
2.   R := AbsRef(R, f, E)
3. end for
4. IG := BuildInterface(R, F, Lib)



Modular Verification: Each function is considered separately in AbsRef
(Algorithm 2). Since the interface graph is an input-enabled interface
automaton, every abstract state in the function can be checked separately for
error reachability in one function transition step. The algorithm starts with
the initial abstraction $R$, and the set of useful variables $V_{abs}$ is
obtained from the support sets of the abstract states. The local abstraction
$R_f$ and the global abstraction $R_G$ are initialized with $R$. The must
abstraction transition is computed with respect to $R_f$, and we compute the
must predecessor $S_M$ of the error set $E$. The set $S_M$ determines the set
of states of the function which eventually reach the error set $E$. The set
$S_M^I$ is the subset of $S_M$ corresponding to the initial set of states of
the function. The one-step concrete pre-image $S^1$ of $S_M{\downarrow}$
checks whether any new states can be added to $S_M{\downarrow}$. If
$S^1 \setminus S_M{\downarrow}$ is non-empty, then the local abstraction $R_f$
is refined and the loop continues. Otherwise the global abstraction $R_G$ is
refined with respect to $S_M^I$. The local and global refinements are
described in the next paragraph. The algorithm terminates when each abstract
state either can reach $E$ or cannot reach $E$ in one function step.
Algorithm 2 AbsRef(R, f, E)

Input: Abstraction R, function f, error set E
Output: updated R

 1. $V_{abs} := \bigcup_{r \in R} support(r)$; $R_f := R$
 2. loop
 3.   $S_M := Pre_M^{f,R_f}(E)$; $S_M^I := S_M{\downarrow} \cap I_L^f$
 4.   $S^1 := Pre^{f,1}(S_M{\downarrow})$
 5.   $s_{new} := S^1 \setminus (S_M{\downarrow})$
 6.   if $s_{new} = \emptyset$ then
 7.     $R_G := R$
 8.     for each $r \in R$ do
 9.       if $(r \cap S_M^I) \neq \emptyset$ and $(r \setminus S_M^I) \neq \emptyset$ then
10.         $R_G := (R_G \cup \{r_1, r_2\}) \setminus \{r\}$, where $r_1 := (r \cap S_M^I)$ and $r_2 := (r \setminus S_M^I)$
11.       end if
12.     end for
13.     return $R_G$
14.   else
15.     choose a variable $v$ from $\{u \in (V^f \setminus V_{abs}) \mid u \in support(s_{new})\}$
16.     refine the abstraction $R_f$ for all valuations of $v$
17.   end if
18. end loop



Automatic Refinement: For refinement of the local abstraction $R_f$, the
algorithm finds a variable $v \in V^f$ which is not in the set $V_{abs}$ and
is in the support set of $S^1 \setminus S_M{\downarrow}$. The variable is
added to the significant set $V_{abs}$, and a new abstraction $R_f$ is
obtained with respect to the different valuations of $v$. The refinement of
the global abstraction $R_G$ happens after the local abstraction reaches a
fix-point and no new states can be added to the set $S_M$. Each abstract state
$r \in R_G$ that has a non-empty intersection with both $S_M^I$ and its
complement is split into two states $r_1$ and $r_2$.
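
The global splitting step can be sketched in Python for the push function of the stack example. This is an illustration under assumptions: global states are encoded as tuples (top, err), the name refine_global is chosen here, and the set of error-reaching states is written down by hand from the push rules instead of being computed by the must predecessor.

```python
def refine_global(abstraction, error_reaching):
    """Split every region that is cut by the set of error-reaching states."""
    refined = set()
    for region in abstraction:
        inside = region & error_reaching
        outside = region - error_reaching
        if inside and outside:
            refined |= {frozenset(inside), frozenset(outside)}
        else:
            refined.add(region)
    return refined


if __name__ == "__main__":
    tops = range(4)
    r0 = frozenset((t, 1) for t in tops)           # error states
    r1 = frozenset({(0, 0)})                       # empty stack
    r2 = frozenset((t, 0) for t in tops if t > 0)  # non-empty stack
    # Global states from which one push call reaches an error (top >= 2),
    # together with the error states themselves.
    reaching = set(r0) | {(2, 0), (3, 0)}
    for region in sorted(refine_global({r0, r1, r2}, reaching), key=sorted):
        print(sorted(region))
```

Only the region of non-empty stacks is split, into the states where a further push is still safe and those where it is not.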



Algorithm 3 BuildInterface(R, F, Lib)

Input: Abstraction R, a set of functions F, a library $Lib = (F_G, V_G, E, I)$
Output: Interface Graph $IG = (N, T, T_e, In, Er)$

 1. $Q, N, T, T_e, In, Er := \emptyset$
 2. append(Q, I); append(N, $I \cup E$); append(In, I); append(Er, E)
 3. while Q is non-empty do
 4.   curr := removeFirst(Q)
 5.   for each $f \in F$ do
 6.     next := $Post^{f,R}(curr)$
 7.     if (not member(N, next)) then append(Q, next); append(N, next) endif
 8.     if ($next \subseteq E$) then $T_e := T_e \cup (curr, f, Er)$ else $T := T \cup (curr, f, next)$ endif
 9.   end for
10. end while
Building Interface: Algorithm 3 computes the interface graph from the
abstraction $R$. The algorithm maintains a list $Q$. The procedure
append(Q, X) adds each element $x \in X$ at the end of $Q$. The procedure
member(Q, x) checks whether x is a member of $Q$. The procedure removeFirst(Q)
removes the first element from $Q$ and returns it. The algorithm computes the
next symbolic state for each element in $Q$ by applying the abstract post
operator. There is an error edge from the current state curr to the error
state Er when the next state of curr is a part of the error set $E$. Otherwise
the algorithm appends the next state to $Q$ and adds a new good edge
(curr, f, next). The algorithm terminates when the list $Q$ is empty.
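
The worklist construction can be sketched in Python on the stack example. Several simplifications are assumptions of this sketch: each function is given directly as a one-call summary on global states (top, err), top is restricted to the reachable values 0..2, and a node is the union of the regions hit by the image of the current node.

```python
from collections import deque

TOPS = range(3)  # assumption: only the reachable values of top
E = frozenset((t, 1) for t in TOPS)
I = frozenset({(0, 0)})
# Final global abstraction: error, top = 0, top = 1, top = 2.
R = [E, I, frozenset({(1, 0)}), frozenset({(2, 0)})]


def push(state):
    top, err = state
    if err:
        return state  # the error set is a sink
    return (top, 1) if top >= 2 else (top + 1, 0)


def pop(state):
    top, err = state
    if err:
        return state
    return (top, 1) if top == 0 else (top - 1, 0)


def abstract_post(summary, node):
    """Union of the regions hit by the image of node under one call."""
    image = {summary(state) for state in node}
    return frozenset().union(*[r for r in R if r & image])


def build_interface(functions):
    nodes, good, bad = [I, E], set(), set()
    queue = deque([I])
    while queue:
        curr = queue.popleft()
        for name, summary in functions.items():
            nxt = abstract_post(summary, curr)
            if nxt not in nodes:
                queue.append(nxt)
                nodes.append(nxt)
            if nxt <= E:
                bad.add((curr, name, E))
            else:
                good.add((curr, name, nxt))
    return nodes, good, bad


if __name__ == "__main__":
    nodes, good, bad = build_interface({"push": push, "pop": pop})
    print(len(nodes), len(good), len(bad))
```

On this input the sketch yields three non-error nodes with four good edges and two erroneous edges (pop on the empty stack and push on the full stack), which agrees with the description of Figure 1(c).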

Example 2. To illustrate the algorithms defined before, let us revisit the
Integer Stack example (Figure 1). We assume that the guarded-update rules
(Figure 1(b)) are converted into a library model with the set of functions
{pop, push}. Let us denote the state-space as $S$. Figure 2 illustrates the
run of the Explore algorithm (Algorithm 1). The initial abstract states $r_0$,
$r_1$ and $r_2$ partition the state-space $S$ into three regions
(Figure 2(a)), where $r_0 = S|_{err=1}$ corresponds to the error states,
$r_1 = S|_{err=0, top=0}$ corresponds to the initial states without error
states, and $r_2 = S|_{err=0, top>0}$ corresponds to the non-initial non-error
states. AbsRef (Algorithm 2) is invoked for the pop function; the significant
variables are $V_{abs} := \{err, top\}$. In the first iteration, the must
predecessor $S_M$ of the error state $r_0$ fails to add any new states.
However, the one-step concrete predecessor of the set $S_M{\downarrow}$
returns a set $S^1$ corresponding to $S|_{pop.s=0, top=0, err=0}$, where pop.s
is the local variable s of the function pop. The support set of
$S^1 \setminus S_M{\downarrow}$ contains a new variable pop.s which is in
$V^f$ but not in $V_{abs}$. The local refinement of $R_f$ adds the different
valuations of the local variable pop.s (Figure 2(b)). The second digit of each
abstract state denotes the value of pop.s in that abstract state. In the next
iteration the must predecessor $S_M$ becomes $\{r_{10}, r_{00}, r_{01}\}$, and
no new concrete states can be added by the one-step predecessor of the set
$S_M{\downarrow}$. Hence the local abstraction $R_f$ cannot be refined
further. The local refinement in Figure 2(b) cannot be returned, as the
locally added variable pop.s cannot reach outside the scope of the function
pop. The global set which leads to the error set is given by $S_M^I$, the
subset of $S_M$ corresponding to the local initial state of the pop function,
i.e. $S|_{pop.s=0}$. Hence the final global abstraction $R_G$ for the pop
function is obtained from the initial global abstraction $R$ of the function
by refining it with respect to the set $S_M^I$ and its complement. The
algorithm returns with an unchanged global abstraction.

Similarly, for the push function the local variable push.s is included in the
local abstraction. Even though no new global variable is added in the
refinement, there is a new refinement of the global abstract state $r_2$ with
respect to the set of states (where top is 2 and err is 0) which reach error
states in one push call. The final global abstraction is shown in
Figure 2(c). The build interface algorithm (Algorithm 3) starts with the
initial state $r_1$ and adds the edges of the graph (Figure 1(c)) until every
node is explored with respect to all functions.

Fig. 2. Run of the algorithm Explore on the IntStack example. (a) The initial
abstraction: regions r1 (err = 0, top = 0), r2 (err = 0, top > 0) and r0
(err = 1). (b) The local abstraction inside the function: regions r10, r11,
r20, r21, r00 and r01. (c) The final global abstraction: regions r1 (top = 0),
r20 (top = 1), r21 (top = 2) and r0.



The interface generated by the Explore algorithm is safe and permissive by
construction. Safety is ensured by the AbsRef algorithm and permissiveness is
ensured by the BuildInterface algorithm. The final abstraction $R$, obtained
after calling AbsRef for each function $f \in F$, distinguishes the
error-reaching regions from the non-reaching ones. In the BuildInterface
algorithm each function $f$ is applied in each of the states of the graph
obtained from the abstraction $R$, and hence all behaviors are captured in the
interface graph.

Theorem 1. Explore (Algorithm 1) returns a safe and permissive interface.



4.2 Implementation Optimizations 

Approximate Abstract Function Summary and Predecessors: For practical
purposes, we do not compute the abstract predecessor operators on the
monolithic transition relations. As in [7], Equation (4) holds for the
approximate operators. The transition for a function $f \in F_G$ is
represented as a number (say $k$) of guarded-update rules. For an abstraction
$R \subseteq 2^{S_f}$, the must and may abstractions of rule
$i \in \{1, \ldots, k\}$ can be given as follows:

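$$i.trans_{m+}^{f,R} := \{(r_1, r_2) \in (R \times R) \mid r_1 \in i.guard{\uparrow}_R^m,\ r_2 \in i.update(r_1{\downarrow}){\uparrow}_R^m\}$$
$$i.trans_{M-}^{f,R} := \{(r_1, r_2) \in (R \times R) \mid r_1 \in i.guard{\uparrow}_R^M,\ r_2 \in i.update(r_1{\downarrow}){\uparrow}_R^M\}$$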

For all $j \in \{m+, M-\}$ and $X \subseteq R$, the approximate transition
relation, the one-step predecessor operator and the multi-step predecessor
operator can be given respectively as:

$$Trans_j^{f,R} := \bigcup_{i=1 \ldots k} i.trans_j^{f,R}$$
$$Pre_j^{f,R,1}(X) := \{r \in R \mid Trans_j^{f,R}(r) \cap X \neq \emptyset\}$$
$$Pre_j^{f,R}(X) := \{r \in R \mid \{r\} \cap (\mu Y.(X \cup Pre_j^{f,R,1}(Y))) \neq \emptyset\}$$

For a disjunctive transition relation, the approximate may predecessor
operator is precise; however, the approximate must predecessor is an
under-approximation of the precise one.
Theorem 2. For each $f \in F$, $R \subseteq 2^{S_f}$, and $X \subseteq R$, we
have

$$Pre_{M-}^{f,R}(X){\downarrow} \subseteq Pre^{f,*}(X{\downarrow}) \subseteq Pre_{m+}^{f,R}(X){\downarrow}.$$

Incremental Building of Interface: Algorithm 1 can be used for incremental
addition of function sets, as we may not need to create the interface for all
the functions at first. The algorithm returns the refined interface for the
included functions only. The created interface can be reused if we later want
to add more functions from the library.

Rule Partition for Function: One more optimization is to partition the rule
set of each function with respect to the abstraction in order to create less
splitting. Computing the must abstraction of each individual rule can create a
large under-approximation, and hence may need more splitting.

Example 3. In the presence of If-Then-Else or Switch constructs in the source
code, we may encounter the following rules after the translation.

    r1 : hd = true  ==> indata' = 0; hd' = false
    r2 : hd = false ==> indata' = 0; hd' = hd

The abstract set $R$ is defined with respect to the different valuations of
the indata variable. If we consider each rule separately and apply the must
abstraction, we miss the fact that the final value of the variable indata will
be 0 and does not depend on the initial value of hd. The must predecessor of
$S|_{indata=0}$ will be $\emptyset$ for both rules, since the must abstraction
of each guard is the empty set. However, if we combine the two rules by taking
the union of the sets, then the must predecessor of $S|_{indata=0}$ will be
$S$ for the combined rule and there will not be any further splitting.
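
The effect described in this example can be reproduced with a short Python sketch. It is a simplification under assumptions: states are tuples (hd, indata) with indata restricted to {0, 1}, the must predecessor is computed by lifting the concrete one-step predecessor with the must abstraction, and all names are hypothetical.

```python
from itertools import product

STATES = set(product([True, False], [0, 1]))  # (hd, indata)
# Abstraction by the value of indata only: each region mixes both hd values.
R = [frozenset(s for s in STATES if s[1] == v) for v in (0, 1)]
TARGET = {s for s in STATES if s[1] == 0}     # states with indata = 0

RULES = [
    (lambda hd, indata: hd is True, lambda hd, indata: (False, 0)),   # r1
    (lambda hd, indata: hd is False, lambda hd, indata: (hd, 0)),     # r2
]


def pre(rule, target):
    """Concrete one-step predecessors of target under a single rule."""
    guard, update = rule
    return {s for s in STATES if guard(*s) and update(*s) in target}


def up_must(concrete):
    """Regions fully contained in the concrete set."""
    return {r for r in R if r <= concrete}


# Rule by rule: lift each rule's predecessors separately, then take the union.
per_rule = set()
for rule in RULES:
    per_rule |= up_must(pre(rule, TARGET))

# Combined: take the union of the concrete predecessors first, then lift.
combined = up_must(set().union(*(pre(rule, TARGET) for rule in RULES)))

if __name__ == "__main__":
    print(per_rule)               # empty: no region fits inside one guard
    print(combined == set(R))     # every region is a must predecessor
```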

The heuristic for partitioning the rule set is obtained from the abstraction
itself. If a function $f$ has $k$ rules, then the $i$-th and $j$-th rules can
be grouped together for an abstraction $R$ if the abstractions of their guards
with respect to $R$ coincide.



5 Results 



In this section we provide the results of some case studies and compare them
with the related work.

Data Stream Case Study: There is a data stream with a header of length $2^h$
and data of length $2^d$, where $h \leq d$. The program uses $d$ bits to
represent the pointer and 1 bit for the "error". The Boolean variable isHeader
is 1 when the pointer is in the header and 0 otherwise. There are four
functions in the program. The functions FirstHeader and FirstData move the
pointer to the first header and the first data location respectively. The
function Next moves the pointer within the header or the data in a cyclic way.
The function Write results in an error when the pointer points to the header
section. Our algorithm produces the interface shown in Figure 3(a). State 1
represents that the pointer is in the data part, and state 2 represents that
the pointer is in the header part.
Fig. 3. Interfaces: (a) Data Stream, (b) Bit-Array-Manipulator



Bit Array Manipulator: The Bit Array Manipulator has four functions: prev,
next, access and modify. The global variable ptr of length $2^k$ specifies the
current location of the pointer. The global Boolean variable valid denotes
whether the pointer is valid. Another Boolean variable err specifies the
library error states. The functions next and prev respectively increment and
decrement the current pointer and set the valid flag to true. The function
access resets the valid flag. The function modify sets err to true when valid
is false, and otherwise sets valid to false. Our algorithm produces the
interface shown in Figure 3(b). State 1 represents that the valid bit is
false, and state 2 represents that the valid bit is true.



Case Study             Params          Time (ms)  Regions  Direct  Learning  CEGAR
Data Stream            h = 2,  d = 12          3        2    1028         2    257
                       h = 4,  d = 12          4        2    4112         2    257
                       h = 13, d = 13         18        2   16384         2      2
Bit Array Manipulator  k = 8                   2        2      68         2      2
                       k = 9                   4        2     130         2      2
                       k = 16                  8        2   16386   Timeout      2

Fig. 4. Results



Comparison. Figure 4 compares our algorithm with related work on these two
examples. The first two columns give the name and the parameter values of each
case study. The next column gives the running time (in milliseconds) of the
explore algorithm, starting from the parsed guarded-update rules. The next
column gives the number of non-error regions in the interface graph. The numbers
of non-error regions for the three related approaches are given in the last
three columns; this data is taken from the work of Beyer et al. [2]. The






results for the Direct algorithm show that it runs fastest, but the size of its
interface graph is exponential in d. We observe that the CEGAR algorithm yields
a minimal graph only when h = d in the Data Stream example. The size of the
graph produced by the CEGAR algorithm depends on how the variables are
represented with Boolean variables. The CEGAR approach refines by adding a new
Boolean variable, which risks splitting many abstract states unnecessarily.
In contrast, our algorithm keeps the global abstraction separate from the local
abstraction inside each function and refines the global abstraction lazily with
respect to the final reachable set (S^j). The Learning algorithm yields the
minimal graph but is the slowest of the three approaches. Our algorithm yields
the same number of non-error regions as the Learning algorithm. However, we
cannot compare running times, because the experiments were run on different
platforms.



6 Application of Interfaces 



In this section, we show how a safe and permissive interface can be useful in
the verification and testing of software programs. The following subsections
briefly describe the modifications needed for the interface to be compatible
with these settings.



6.1 Software Verification with Interfaces 

Let us assume that we have computed an interface graph for a set of functions.
Given a client program that calls those functions, one can immediately check
the client against the interface graph. The idea is to simulate the actions of
the client program on the interface graph and check whether the library error
state (state "ERROR") is reached. For example, a client consisting of the
single line modify(b) on the BitArrayManipulator b can be simulated on the
interface graph (Figure 3(b)). We can see that the error state ERROR is reached
from the initial state (state 1). There can be infinitely many possible clients
of those functions, and each of them can be model-checked once the interface is
computed.
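
The simulation of a straight-line client can be sketched in a few lines of
Python. The transition table below is read off the textual description of the
Bit Array Manipulator (state 1: valid is false, state 2: valid is true) and is
an assumption about the content of Figure 3(b); the name client_is_safe is
hypothetical.

```python
# Interface graph as a transition table: (state, function) -> next state.
INTERFACE = {
    (1, "next"): 2, (1, "prev"): 2, (1, "access"): 1, (1, "modify"): "ERROR",
    (2, "next"): 2, (2, "prev"): 2, (2, "access"): 1, (2, "modify"): 1,
}
INITIAL = 1


def client_is_safe(calls, graph=INTERFACE, initial=INITIAL):
    """Replay a sequence of library calls on the interface graph and
    report whether the ERROR state is avoided."""
    state = initial
    for call in calls:
        state = graph[(state, call)]
        if state == "ERROR":
            return False
    return True


single_modify = client_is_safe(["modify"])           # the client from the text
guarded_modify = client_is_safe(["next", "modify"])  # validate first
```

The single-call client modify(b) is rejected, while a client that calls next
before modify is accepted. A client with branches or loops would require
exploring all of its call sequences rather than a single list.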



6.2 Offline Test Case Generation 

In the model-based testing paradigm, an implementation under test (IUT) is
checked against a given model program (a specification of the IUT). Our
algorithm can build an interface graph from the definitions of the functions
given in the model program. We can then create a C source regression test-suite
from the interface generated from the libraries. However, we need to extend the
function calls with argument values to create a test-bench for the IUT. For
example,
Figure 1(a) can be generated from the model program in Figure 1(c). If we are



given a linked-list implementation of a finite-size integer stack, we can create
an offline test-suite from the interface graph. Testing the implementation
against the test-suite checks whether the interface goes to the error state
if and only if the implementation goes to the error state. If there is a
discrepancy between the behavior of the interface graph and the code, we
conclude that the implementation source needs further checking.
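
The offline testing loop can be sketched as follows. Everything in this Python
example is hypothetical: the interface graph of a stack with capacity 2, the toy
linked-list implementation LinkedStack, and the fixed argument value 0 supplied
by the test bench. It only illustrates the "error if and only if error" check
described above, not the C test-suite generated by the tool.

```python
from itertools import product

CAPACITY = 2  # hypothetical bound on the stack size

# Hypothetical interface graph: the state is the number of stored elements.
GRAPH = {
    (0, "push"): 1, (0, "pop"): "ERROR",
    (1, "push"): 2, (1, "pop"): 0,
    (2, "push"): "ERROR", (2, "pop"): 1,
}


def generate_tests(max_len):
    """List (call sequence, expected_error) for all sequences up to max_len."""
    tests = []
    for length in range(1, max_len + 1):
        for calls in product(("push", "pop"), repeat=length):
            state = 0
            for call in calls:
                state = GRAPH[(state, call)]
                if state == "ERROR":
                    break
            tests.append((calls, state == "ERROR"))
    return tests


class LinkedStack:
    """Toy linked-list IUT that raises on overflow and underflow."""

    def __init__(self):
        self.head, self.size = None, 0

    def push(self, value):
        if self.size >= CAPACITY:
            raise OverflowError("stack full")
        self.head, self.size = (value, self.head), self.size + 1

    def pop(self):
        if self.head is None:
            raise IndexError("stack empty")
        value, self.head = self.head
        self.size -= 1
        return value


def iut_reaches_error(calls):
    """Run one call sequence on a fresh IUT and report whether it fails."""
    stack = LinkedStack()
    try:
        for call in calls:
            if call == "push":
                stack.push(0)  # argument value chosen by the test bench
            else:
                stack.pop()
    except (OverflowError, IndexError):
        return True
    return False


def discrepancies(max_len=4):
    """Call sequences on which the interface and the IUT disagree."""
    return [calls for calls, expected in generate_tests(max_len)
            if iut_reaches_error(calls) != expected]
```

An empty result from discrepancies means that, up to the chosen length, the
implementation fails exactly on the sequences that drive the interface graph to
the error state.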

7 Conclusions 

In this section we conclude with a summary of the work and possible future
directions. We have provided a new algorithm for interface synthesis with a
local-global abstraction refinement framework. This framework can dramatically
reduce the state-space of the interface generation by hiding local variables
inside each function. The abstract summarization of the functions provides
scalability. The modular analysis handles each function separately. In our
generalized setting, any set of C-style functions can be handled.

The results show that our algorithm provides a safe, permissive and sufficiently
minimal (i.e., comparable to the learning algorithms) interface for the set of
functions. We have provided approximate abstract predecessor operators to handle
the state-space inside the function. The interface synthesis can be incremental:
one can add new functions to the interface, which may lead to refinements
corresponding to the added function.

The interface can be used to verify clients immediately and as an offline
test-suite for a new, untested implementation. However, the translation engine
is very basic and some parts are done manually. In the future we would like to
cover more aspects of C source code (e.g., pointers and recursive data types) so
that we can handle bigger case studies. We would like to see how shape analysis
algorithms can be used to translate complex data types. We would also like to
include CIL inside the TICC tool so that it can parse C functions and represent
the rules directly in MDD format. Finally, we would like to implement the
back-end using a combination of MDDs and SMT solvers so that state-space
problems can be handled better.
Appendix

A C function that computes the n-th Fibonacci number is translated into a set of
guarded-update rules. To handle the activation stack and store the context of
the caller, there is an explicit implementation of an integer stack. The
variable nextpc denotes the next value of the location variable after a return
from one of the stack operations. The variable v holds the value of the input
parameter of push and is assigned before a call to push. v is also the output
parameter of pop and is read after pop returns.

module Fibonacci:

var i, s, top: [0..MAX]
var v: [0..15]
var a0, a1, ... : [0..15]
var nextpc: [0..31]

output push: {
  s=15 & top < MAX ==> top'=top+1 & i'=top & s'=16;
  s=16 & i=0 ==> s'=nextpc & a0'=v;

}

output pop: {
  s=17 ==> i'=top & s'=18;
  s=18 & i=0 ==> s'=19 & v'=a0;

  s=19 & i>0 ==> top'=i-1 & s'=nextpc;
}

endmodule

The rule set fib defines the transitions inside the Fibonacci function. The
variable res stores the result when the call returns, and tmp1 and tmp2 are two
temporary variables. A recursive call is translated into saving the return
address and the current value of n, initializing n for the called function, and
then jumping to the initial location of the function.

var n: [0..20]
var res, tmp1, tmp2: [0..31]

output fib: {
  s=0 & n<3 ==> res'=1 & s'=11;
  s=0 & n>=3 ==> s'=2;
  s=2 ==> nextpc'=3 & s'=15 & v'=5;
  s=3 ==> nextpc'=4 & s'=15 & v'=n;
  s=4 ==> n'=n-1 & s'=0;
  s=5 ==> s'=6 & tmp1'=res;
  s=6 ==> nextpc'=7 & s'=15 & v'=9;
  s=7 ==> nextpc'=8 & s'=15 & v'=n;
  s=8 ==> n'=n-2 & s'=0;
  s=9 ==> s'=10 & tmp2'=res;
  s=10 ==> s'=11 & res'=tmp1+tmp2;
  s=11 ==> nextpc'=12 & s'=17;
  s=12 ==> n'=v & s'=13;
  s=13 ==> nextpc'=14 & s'=15;
  s=14 ==> s'=v;
}
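
The translation idea, saving the return address and the caller's n on an
explicit stack and then jumping back to the entry location, can be illustrated
with a small Python sketch. It is written under assumptions and is not a
transcription of the rule set: the location names are hypothetical (their
numeric values merely echo locations 0, 5, 9 and 11), a frame is pushed as one
tuple instead of two separate push calls, and the frame also stores tmp1, which
the rules do not show.

```python
# Hypothetical location names for the program counter.
ENTRY, AFTER_FIRST, AFTER_SECOND, RETURN, HALT = 0, 5, 9, 11, -1


def fib_explicit_stack(n):
    """Compute the n-th Fibonacci number (fib(1) = fib(2) = 1) without
    Python-level recursion, using a program counter and an explicit stack."""
    # Frame layout: (return location, saved n, saved tmp1).
    stack = [(HALT, n, 0)]
    pc, res, tmp1 = ENTRY, 0, 0
    while pc != HALT:
        if pc == ENTRY:
            if n < 3:
                res, pc = 1, RETURN           # base case
            else:
                stack.append((AFTER_FIRST, n, tmp1))
                n, pc = n - 1, ENTRY          # "call" fib(n - 1)
        elif pc == AFTER_FIRST:
            tmp1 = res                        # keep fib(n - 1)
            stack.append((AFTER_SECOND, n, tmp1))
            n, pc = n - 2, ENTRY              # "call" fib(n - 2)
        elif pc == AFTER_SECOND:
            res, pc = tmp1 + res, RETURN      # fib(n - 1) + fib(n - 2)
        else:                                 # pc == RETURN
            pc, n, tmp1 = stack.pop()         # restore the caller's context
    return res


first_ten = [fib_explicit_stack(k) for k in range(1, 11)]
```

Each loop iteration corresponds to firing one guarded rule: the branch on pc
plays the role of the guard on s, and the assignments play the role of the
primed updates.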